Learning Rational Stochastic Languages
Authors
Abstract
Given a finite set of words $w_1, \ldots, w_n$ drawn independently according to a fixed, unknown distribution $P$, called a stochastic language, a usual goal in Grammatical Inference is to infer an estimate of $P$ in some class of probabilistic models, such as Probabilistic Automata (PA). Here, we study the class $S^{\mathrm{rat}}_{\mathbb{R}}(\Sigma)$ of rational stochastic languages, which consists of the stochastic languages that can be generated by Multiplicity Automata (MA) and which strictly includes the class of stochastic languages generated by PA. Rational stochastic languages have a minimal normal representation, which may be very concise and whose parameters can be efficiently estimated from stochastic samples. We design an efficient inference algorithm, DEES, which aims at building a minimal normal representation of the target. Despite the fact that no recursively enumerable class of MA computes exactly $S^{\mathrm{rat}}_{\mathbb{Q}}(\Sigma)$, we show that DEES strongly identifies $S^{\mathrm{rat}}_{\mathbb{Q}}(\Sigma)$ in the limit. We study the intermediate MA output by DEES and show that they compute rational series which converge absolutely to one and which can be used to provide stochastic languages that closely estimate the target.
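To make the objects in the abstract concrete, the following minimal sketch shows how a multiplicity automaton assigns a weight to a word (an initial row vector, one transition matrix per letter, and a terminal vector), and how an empirical estimate of $P$ is obtained from a sample. The automaton, the names `ma_weight` and `empirical_distribution`, and the sample are all hypothetical illustrations; this is not the DEES algorithm, which is not specified in the abstract.

```python
from collections import Counter
from fractions import Fraction


def ma_weight(iota, transitions, tau, word):
    """Return r(word) = iota * M_{x1} * ... * M_{xn} * tau."""
    vec = list(iota)  # row vector of state weights after reading a prefix
    for letter in word:
        m = transitions[letter]
        # Row vector times the transition matrix of the current letter.
        vec = [sum(vec[i] * m[i][j] for i in range(len(vec)))
               for j in range(len(m[0]))]
    # Dot product with the terminal vector.
    return sum(v * t for v, t in zip(vec, tau))


def empirical_distribution(sample):
    """Relative frequency of each word in a sample (a plain estimate of P)."""
    counts = Counter(sample)
    n = len(sample)
    return {w: Fraction(c, n) for w, c in counts.items()}


# Hypothetical 2-state automaton over {a, b}. Every parameter is
# non-negative and, for each state, outgoing weights plus the terminal
# weight sum to 1, so this particular MA is also a probabilistic automaton.
IOTA = [Fraction(1), Fraction(0)]
TRANS = {
    "a": [[Fraction(0), Fraction(1, 2)],
          [Fraction(0), Fraction(0)]],
    "b": [[Fraction(1, 4), Fraction(0)],
          [Fraction(0), Fraction(1, 2)]],
}
TAU = [Fraction(1, 4), Fraction(1, 2)]

if __name__ == "__main__":
    for w in ["", "a", "b", "ab"]:
        print(repr(w), ma_weight(IOTA, TRANS, TAU, w))
    # Made-up sample, only to show the shape of the estimate.
    print(empirical_distribution(["a", "a", "ab", ""]))
```

A general MA may use negative parameters as long as the resulting series is a probability distribution over words; the toy parameters above are restricted to the PA case only to keep the arithmetic easy to check by hand.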
Similar resources
Learning Rational Stochastic Tree Languages
We consider the problem of learning stochastic tree languages, i.e. probability distributions over a set of trees $T(F)$, from a sample of trees independently drawn according to an unknown target $P$. We consider the case where the target is a rational stochastic tree language, i.e. it can be computed by a rational tree series or, equivalently, by a multiplicity tree automaton. In this paper, we ...
Relevant Representations for the Inference of Rational Stochastic Tree Languages
Recently, an algorithm DEES was proposed for learning rational stochastic tree languages. Given a sample of trees independently and identically drawn according to a distribution defined by a rational stochastic language, DEES outputs a linear representation of a rational series which converges to the target. DEES can then be used to identify in the limit with probability one rational stochastic t...
Rational stochastic languages
The goal of the present paper is to provide a systematic and comprehensive study of rational stochastic languages over a semiring $K \in \{\mathbb{Q}, \mathbb{Q}^+, \mathbb{R}, \mathbb{R}^+\}$. A rational stochastic language is a probability distribution over a free monoid $\Sigma^*$ which is rational over $K$, that is, which can be generated by a multiplicity automaton with parameters in $K$. We study the relations between the classes of rational stochasti...
A probabilistic extension of locally testable tree languages
Probabilistic k-testable models (usually known as k-gram models in the case of strings) can be easily identified from samples and allow for smoothing techniques to deal with unseen events. In this paper we introduce the family of stochastic k-testable tree languages and describe how these models can approximate any stochastic rational tree language. This is applied, as a particular case, to the...
Using Pseudo-stochastic Rational Languages in Probabilistic Grammatical Inference
In probabilistic grammatical inference, a usual goal is to infer a good approximation of an unknown distribution P called a stochastic language. The estimate of P stands in some class of probabilistic models such as probabilistic automata (PA). In this paper, we focus on probabilistic models based on multiplicity automata (MA). The stochastic languages generated by MA are called rational stocha...